AITopics

2606.28123

Country: Europe > France (0.28)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

arXiv.org Machine Learning, Jun-25-2026

A Bregman Perspective on Classification and Regression Trees

Bourel, Mathias

Classification and Regression Trees (CART) constitute one of the most influential paradigms in statistical learning. Although a variety of impurity measures have been proposed for different statistical models, these criteria are typically introduced on a case-by-case basis and analyzed separately. In this paper, we study CART through the lens of Bregman divergences. This perspective places the classical least-squares criterion, Poisson deviance, Kullback-Leibler-type losses, and other impurity measures associated with exponential-family models within a common framework. As a result, key ingredients of the CART methodology -- including node representatives, impurity measures, and split selection rules -- can be expressed and analyzed through general properties of convex functions rather than through separate model-specific constructions. Beyond the algorithmic formulation, we investigate theoretical properties of Bregman-based CART procedures. In particular, we analyze how geometric properties of the generating convex function influence impurity reductions and stability of recursive partitions. We also establish consistency results within the proposed framework, providing a unified theoretical treatment for a broad family of CART-type procedures. Our results provide a geometric interpretation of impurity-based tree construction and show that many classical CART impurity criteria admit a common interpretation within a Bregman framework.
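
As a rough illustration of the idea in this abstract, the sketch below computes a Bregman divergence from a convex generator, uses the node mean as the representative, and picks the threshold on a single feature that gives the largest impurity reduction. It is not the paper's algorithm: the names `bregman`, `impurity`, `best_split`, `SQUARED`, and `POISSON` are hypothetical, and the two generators are just the textbook choices that yield squared error and a Poisson-deviance-type loss.

```python
import math

def bregman(phi, dphi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)

# Illustrative generators (assumed, not taken from the paper):
# phi(t) = t^2 gives D(x, y) = (x - y)^2.
SQUARED = (lambda t: t * t, lambda t: 2.0 * t)
# phi(t) = t log t - t (t > 0) gives D(x, y) = x log(x / y) - x + y.
POISSON = (lambda t: t * math.log(t) - t, lambda t: math.log(t))

def impurity(values, gen):
    """Total divergence of the values from their mean (the node representative)."""
    phi, dphi = gen
    mean = sum(values) / len(values)
    return sum(bregman(phi, dphi, v, mean) for v in values)

def best_split(xs, ys, gen):
    """Return (threshold, gain) with the largest impurity reduction on one feature."""
    pairs = sorted(zip(xs, ys))
    parent = impurity([y for _, y in pairs], gen)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = parent - impurity(left, gen) - impurity(right, gen)
        if gain > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, gain)
    return best

# Same data, two generators: only the generator changes, not the procedure.
xs, ys = [1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0]
print(best_split(xs, ys, SQUARED))
print(best_split(xs, ys, POISSON))
```

Swapping the generator pair is the only change needed to move between loss functions, which is the kind of unification the abstract describes.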

artificial intelligence, decision tree learning, machine learning, (16 more...)

2606.13984

Country: Asia (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Neural Information Processing Systems, Jun-20-2026, 08:41:07 GMT

Deep Legendre Transform

We introduce a novel deep learning algorithm for computing convex conjugates of differentiable convex functions, a fundamental operation in convex analysis with various applications in different fields such as optimization, control theory, physics and economics. While traditional numerical methods suffer from the curse of dimensionality and become computationally intractable in high dimensions, more recent neural network-based approaches scale better, but have mostly been studied with the aim of solving optimal transport problems and require the solution of complicated optimization or max-min problems. Using an implicit Fenchel formulation of convex conjugation, our approach facilitates an efficient gradient-based framework for the minimization of approximation errors and, as a byproduct, also provides a posteriori estimates of the approximation accuracy. Numerical experiments demonstrate our method's ability to deliver accurate results across different high-dimensional examples. Moreover, by employing symbolic regression with Kolmogorov-Arnold networks, it is able to obtain the exact convex conjugates of specific convex functions.
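
For readers unfamiliar with convex conjugation, here is a minimal one-dimensional sketch. It compares a brute-force grid maximization of f*(y) = sup_x (x*y - f(x)), the kind of traditional method that scales poorly with dimension, against the standard identity f*(f'(x)) = x*f'(x) - f(x) for differentiable convex f. Whether the paper's "implicit Fenchel formulation" takes exactly this form is an assumption; the function names and the test function f(x) = exp(x) are illustrative only.

```python
import math

def conjugate_on_grid(f, y, grid):
    """Brute-force f*(y): maximize x * y - f(x) over the grid points."""
    return max(x * y - f(x) for x in grid)

def fenchel_identity_value(f, df, x):
    """Return (y, f*(y)) at y = f'(x), using f*(f'(x)) = x * f'(x) - f(x)."""
    y = df(x)
    return y, x * y - f(x)

# Test function (assumed for illustration): f(x) = exp(x), so f'(x) = exp(x)
# and the closed-form conjugate is f*(y) = y * log(y) - y for y > 0.
f = math.exp
df = math.exp

grid = [i / 1000.0 for i in range(-5000, 5001)]  # x in [-5, 5], step 0.001
y, exact = fenchel_identity_value(f, df, 1.0)    # evaluates f* at y = e
approx = conjugate_on_grid(f, y, grid)

print(y, exact, approx)
```

The grid search needs a number of evaluations that grows exponentially with dimension, while the identity gives conjugate values at the points y = f'(x) with no inner optimization.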

artificial intelligence, legendre transform, machine learning, (17 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Neural Information Processing Systems, Jun-18-2026, 18:57:15 GMT

Fast Zeroth-Order Convex Optimization with Quantum Gradient Methods

We study quantum algorithms based on quantum (sub)gradient estimation using noisy function evaluation oracles, and demonstrate the first dimension-independent query complexities (up to poly-logarithmic factors) for zeroth-order convex optimization in both smooth and nonsmooth settings. Interestingly, only using noisy function evaluation oracles, we match the first-order query complexities of classical gradient descent, thereby exhibiting exponential separation between quantum and classical zeroth-order optimization. We then generalize these algorithms to work in non-Euclidean settings by using quantum (sub)gradient estimation to instantiate mirror descent and its variants, including dual averaging and mirror prox. By leveraging a connection between semidefinite programming and eigenvalue optimization, we use our quantum mirror descent method to give a new quantum algorithm for solving semidefinite programs, linear programs, and zero-sum games. We identify a parameter regime in which our zero-sum games algorithm is faster than any existing classical or quantum approach.

artificial intelligence, machine learning, natural language, (18 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Neural Information Processing Systems, Jun-14-2026, 10:32:32 GMT

Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness

Smoothness is known to be crucial for acceleration in offline optimization, and for gradient-variation regret minimization in online learning. Interestingly, these two problems are actually closely connected -- accelerated optimization can be understood through the lens of gradient-variation online learning. In this paper, we investigate online learning with Hölder smooth functions, a general class encompassing both smooth and non-smooth (Lipschitz) functions, and explore its implications for offline optimization.
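
For reference, the usual definition of Hölder smoothness is sketched below. The symbols L, \nu, and the domain \mathcal{X} are notation assumed here and do not appear in the abstract.

```latex
% f is (L, \nu)-Hölder smooth on \mathcal{X} for some \nu \in [0, 1] if
\[
  \|\nabla f(x) - \nabla f(y)\|_{*} \;\le\; L \,\|x - y\|^{\nu}
  \qquad \text{for all } x, y \in \mathcal{X}.
\]
% \nu = 1: standard L-smoothness (Lipschitz gradient).
% \nu = 0: bounded gradient variation, which holds for any G-Lipschitz f with L = 2G.
```

The two endpoints of \nu recover the smooth and the non-smooth (Lipschitz) cases that the abstract says the class encompasses.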

machine learning, natural language, optimization, (19 more...)

Country: Asia (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.76)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Neural Information Processing Systems, Jun-13-2026, 10:56:31 GMT

Deep Legendre Transform

artificial intelligence, deep legendre transform aleksey minabutdinov, machine learning, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

arXiv.org Machine Learning, May-29-2026

Saddle Networks: Structure-Preserving Architectures for Convex-Concave Functions

Warin, Xavier

Saddle-point models arise throughout optimization, optimal transport, robust learning, and control. In many applications, the relevant function f(x,y) is convex in x and concave in y, and preserving this geometry is essential for obtaining tractable min-max formulations and reliable certificates. We introduce a structured separable decomposition that preserves the convex-concave geometry and prove a complete one-dimensional approximation theorem under a mixed Monge-type convexity condition. We then describe practical saddle network architectures that preserve convexity in x and concavity in y by construction. The proposed architectures require only convexity-preserving neural networks, together with simple output transformations enforcing sign and concavity constraints. Finally, we report numerical benchmarks in dimensions 1 and 5, showing that the proposed saddle networks achieve high accuracy on smooth, nonsmooth, and high-rank convex-concave test functions.

artificial intelligence, convex, machine learning, (17 more...)

2605.28894

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Neural Information Processing Systems, May-1-2026, 02:27:18 GMT

Horospherical Decision Boundaries for Large Margin Classification in Hyperbolic Space

Hyperbolic spaces have been quite popular in the recent past for representing hierarchically organized data. Further, several classification algorithms for data in these spaces have been proposed in the literature. These algorithms mainly use either hyperplanes or geodesics as decision boundaries in a large-margin classifier setting, which leads to a non-convex optimization problem. In this paper, we propose a novel large-margin classifier based on horospherical decision boundaries. It leads to a geodesically convex optimization problem that can be solved with any Riemannian gradient descent technique, guaranteeing a globally optimal solution.

artificial intelligence, hyperbolic space, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

Hundrieser, Shayan, Kong, Insung, Schmidt-Hieber, Johannes

Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport

arXiv.org Machine Learning, Apr-30-2026

We introduce Hyper Input Convex Neural Networks (HyCNNs), a novel neural network architecture designed for learning convex functions. HyCNNs combine the principles of Maxout networks with input convex neural networks (ICNNs) to create a neural network that is always convex in the input, is theoretically capable of leveraging depth, and performs reliably when trained at scale compared to ICNNs. Concretely, we prove that HyCNNs require exponentially fewer parameters than ICNNs to approximate quadratic functions up to a given precision. Through a series of synthetic experiments, we demonstrate that HyCNNs outperform existing ICNNs and MLPs in terms of predictive performance for convex regression and interpolation tasks. We further apply HyCNNs to learn high-dimensional optimal transport maps for synthetic examples and for single-cell RNA sequencing data, where they often outperform ICNN-based neural optimal transport methods and other baselines across a wide range of settings.

artificial intelligence, hycnn, machine learning, (19 more...)

2604.26942

Country: North America > United States (0.45)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing Systems, Apr-29-2026, 23:05:54 GMT

Parallel Submodular Function Minimization

We consider the parallel complexity of submodular function minimization (SFM). We provide a pair of methods which obtain two new query versus depth tradeoffs for a submodular function defined on subsets of n elements that has integer values between -M and M. The first method has depth 2 and query complexity

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Europe (0.68)
North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.89)